Fast Ranking in Limited Space
Authors
Abstract
Ranking techniques have long been suggested as alternatives to more conventional Boolean methods for searching document collections. The cost of computing a ranking is, however, greater than the cost of performing a Boolean search, in terms of both memory space and processing time. Here we consider the resources required by the cosine method of ranking, and show that with a careful application of indexing and selection techniques both the space and time required by ranking can be substantially reduced. The methods described in this paper have been used to build a retrieval system for a collection of over two million pages of text, with which it is possible to process ranked queries of 50–60 terms in about 5% of the space required by previous implementations, in as little as 25% of the time, and with no loss of retrieval effectiveness.
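For readers unfamiliar with the cosine method mentioned in the abstract, the following is a minimal sketch of cosine ranking over an inverted index with one score accumulator per candidate document. It is a textbook-style illustration, not the paper's implementation: the function names (`build_index`, `doc_lengths`, `rank`), the whitespace tokenizer, and the particular TF-IDF weighting are all assumptions made here, and the space-saving and time-saving techniques the paper describes are not reproduced.

```python
import math
from collections import defaultdict


def build_index(docs):
    """Build an inverted index: term -> list of (doc_id, term_frequency)."""
    index = defaultdict(list)
    for doc_id, text in enumerate(docs):
        counts = defaultdict(int)
        for term in text.lower().split():  # naive tokenizer (assumption)
            counts[term] += 1
        for term, tf in counts.items():
            index[term].append((doc_id, tf))
    return index


def idf(n_docs, doc_freq):
    """Inverse document frequency; one common variant (assumption)."""
    return math.log(1 + n_docs / doc_freq)


def doc_lengths(index, n_docs):
    """Precompute the Euclidean length of each document's weight vector."""
    squared = [0.0] * n_docs
    for postings in index.values():
        term_idf = idf(n_docs, len(postings))
        for doc_id, tf in postings:
            weight = (1 + math.log(tf)) * term_idf
            squared[doc_id] += weight * weight
    return [math.sqrt(s) for s in squared]


def rank(query, index, lengths, top_k=3):
    """Return up to top_k (doc_id, score) pairs, best first.

    The query vector's length is the same for every document, so it is
    omitted; this changes the scores but not the ordering.
    """
    n_docs = len(lengths)
    accumulators = defaultdict(float)  # one accumulator per candidate document
    for term in set(query.lower().split()):
        postings = index.get(term)
        if not postings:
            continue
        term_idf = idf(n_docs, len(postings))
        for doc_id, tf in postings:
            doc_weight = (1 + math.log(tf)) * term_idf
            accumulators[doc_id] += doc_weight * term_idf  # query weight = idf
    scored = [(d, s / lengths[d]) for d, s in accumulators.items() if lengths[d] > 0]
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_k]
```

In this sketch the accumulator table and the array of document lengths are the main memory costs of ranking beyond the index itself, which is the kind of overhead the abstract refers to when comparing ranking with Boolean search.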
Similar resources
Fast Voltage and Power Flow Contingency Ranking Using Enhanced Radial Basis Function Neural Network
Deregulation of power systems in recent years has made static security assessment a major concern, for which a fast and accurate evaluation methodology is needed. Contingencies related to voltage violations and power line overloading have been responsible for power system collapse. This paper presents an enhanced radial basis function neural network (RBFNN) approach for on-line ranking of ...
Fast Unsupervised Automobile Insurance Fraud Detection Based on Spectral Ranking of Anomalies
Collecting insurance fraud samples is costly and, if performed manually, very time consuming. This suggests the use of unsupervised models. One accurate method in this regard is Spectral Ranking of Anomalies (SRA), which has been shown to work better than other methods for auto insurance fraud detection specifically. However, this approach is not scalable to large samples and is not appro...
Modified particle swarm optimization algorithm to solve location problems on urban transportation networks (Case study: Locating traffic police kiosks)
Nowadays, traffic congestion is a big problem in metropolises all around the world. Traffic problems grow with rising population and the slow growth of urban transportation systems. Car accidents or population concentration in particular places due to urban events can cause traffic congestion. Such traffic problems require the direct involvement of the traffic police, and it is urgent for the...
Online Streaming Feature Selection Using Geometric Series of the Adjacency Matrix of Features
Feature Selection (FS) is an important pre-processing step in machine learning and data mining. All the traditional feature selection methods assume that the entire feature space is available from the beginning. However, online streaming features (OSF) are an integral part of many real-world applications. In OSF, the number of training examples is fixed while the number of features grows with t...
Evaluating the efficiency and ranking of west Guilan municipalities of urban services section using data envelopment analysis (DEA)
Municipalities, like any other organization, need assessment and efficiency measurement to make better use of their limited resources and achieve greater effectiveness. The aim of this study is to evaluate the efficiency of municipalities, determining efficient and inefficient municipalities using data envelopment analysis, and classifying the municipalities using Anderson Peterson technique...